feat: ship Opus 4.6 + Sonnet 4.6 as default model config with 200K context #12
royosherove wants to merge 176 commits into main
Updated all 6 files in bootstraps/optional/ and bootstraps/telegram/:
- BOOTSTRAP-GITHUBACTION-CODE-REVIEW: marked agent-agnostic
- BOOTSTRAP-PIPELINE-NOTIFICATIONS: added Hermes event/webhook injection
- BOOTSTRAP-WEB-UI: added Hermes Open WebUI + API server alternative
- OPTIMIZE-TOO-LARGE-CONTEXT: added Hermes compression/limits config
- BOOTSTRAP-TELEGRAM: added Hermes gateway setup, .env config, pairing
- BOOTSTRAP-TELEGRAM-GROUP: added Hermes group config via config.yaml

All files now have an 'Applies to: All agents' header and agent-specific sections where config differs.
…bootstrap
Incorporates the improvements from PR #3 by @gilinachum, applied to the current multi-agent file structure (bootstraps/essential/, bedrockify on port 8090, OpenClaw/Hermes sections).
- Step 5: backfill existing memory with 'openclaw memory index --force'
- Memory quality section: heartbeat noise degrades vector search
- Rule: only write what changed (decisions, bugs, alerts)
- Exclude pattern config for heartbeat files

Co-authored-by: Gili Nachum <gilinachum@users.noreply.github.com>
…nd ironclaw
- Installer reads packs/registry.yaml to build the selection menu dynamically
- No more hardcoded pack list; new packs auto-appear by adding to registry
- Experimental packs show (experimental) tag in yellow
- Instance size defaults read from registry per pack
- Deploy templates (CFN, SAM, Terraform) updated to accept pi and ironclaw
- Registry entries added for pi (experimental) and ironclaw (experimental)
Single static Rust binary from NEAR AI. Uses openai_compatible backend pointed at bedrockify — bypasses NEAR AI OAuth entirely. Pre-built arm64 (musl) binary from GitHub releases. Known issue: dbus/keyring on headless EC2 — mitigated with IRONCLAW_DISABLE_KEYRING=1 in .env.
…files
Code review findings:
- CRITICAL: import yaml requires PyYAML, which isn't in the Python stdlib. The installer runs on user machines (laptops, CloudShell) where PyYAML may not be installed. Replaced with a simple regex-based parser using only stdlib (re module).
- Removed ironclaw pack files that leaked from parallel sub-agent work. Those belong on feature/pack-ironclaw only.
Code review finding: IRONCLAW_DISABLE_KEYRING is not a real IronClaw env var — it was guessed. IronClaw uses secret-service/zbus for OS credential store but the LLM_BACKEND=openai_compatible path bypasses it. Replaced with accurate comment about the dbus situation.
…ithmetic errors Code review finding: if user enters non-numeric text (e.g. 'abc' or empty), bash arithmetic $(( pack_choice - 1 )) would error on some shells. Now strips non-digit chars and defaults to 1.
collect_config() runs before prepare_repo(), so CLONE_DIR is empty and the registry file doesn't exist locally. Now falls back to fetching from the GitHub raw URL. Also guards both Python calls against empty registry path.
…olation
Code review MEDIUM findings:
- $registry path interpolated into Python open() breaks if the path contains quotes
- $PACK_NAME interpolated into a Python string is a fragile injection risk

Both are now passed as sys.argv[1]/sys.argv[2] instead.
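The pattern behind this fix can be sketched as follows. This is a minimal illustration, not the installer's actual code: the helper name `lookup_instance_type`, the inline `SCRIPT`, and the use of a JSON file as a stand-in for the registry are all assumptions made to keep the example stdlib-only. The point is that the path and pack name reach the child interpreter as arguments, so quotes in either value never become part of the Python source.

```python
import subprocess
import sys

# Hypothetical inline script. It reads its inputs from sys.argv instead of
# having them pasted into the source text, so quotes in the values are harmless.
SCRIPT = """
import json, sys
registry_path, pack_name = sys.argv[1], sys.argv[2]
with open(registry_path) as fh:
    registry = json.load(fh)
print(registry.get(pack_name, {}).get("instance_type", ""))
"""


def lookup_instance_type(registry_path: str, pack_name: str) -> str:
    """Run the inline script with both values passed as arguments."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT, registry_path, pack_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With `python -c`, `sys.argv[0]` is `-c`, so the two values arrive as `sys.argv[1]` and `sys.argv[2]`, matching the commit message.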
…ack tests
- Extract bedrockify health check into shared check_bedrockify_health() helper in common.sh (DRY: same pattern was inlined in every pack)
- Refactor ironclaw/install.sh to use the shared helper
- Add packs/ironclaw/test.sh with 35 offline tests covering:
  - manifest.yaml structure validation
  - Architecture detection (aarch64/arm64/x86_64/unsupported)
  - Download URL construction
  - .env config generation (LLM_BACKEND, model, port, no NEAR AI tokens)
  - Temp dir cleanup on failure/success paths
  - common.sh integration (shared helper used, inline code removed)
  - Shell profile resources
  - Script basics (strict mode, sourcing, idempotency docs)
  - Live environment tests (skipped when ironclaw/bedrockify unavailable)
- Add packs/pi/test.sh (41 tests): manifest validation, install.sh interface, models.json generation with various model ID formats, shell-profile.sh variable checks, idempotency patterns
- Fix shell-profile.sh: add comment header, PACK_ALIASES, expand PACK_BANNER_NAME and PACK_BANNER_COMMANDS to match hermes/openclaw pack conventions
Discovers tests at:
- packs/*/test.sh (per-pack tests)
- tests/test-*.sh (project-level tests)

Runs each in a separate matrix job (fail-fast: false). Triggers on pushes to main and feature branches when packs/tests change.
- Extract inline Python YAML parser from install.sh into scripts/parse-registry.py
- Eliminates DRY violation: both pack listing and instance_type lookup now call the same script with different arguments (list-agents vs get)
- Parser has full docstring explaining the state-machine approach
- install.sh updated to call the external script instead of inline Python
- Add tests/test-registry-parser.sh with 37 test cases covering:
  - Real registry parsing (all 4 agent packs discovered)
  - Experimental flag detection
  - Instance type lookup for all packs
  - Arbitrary key retrieval
  - Missing pack/key returns empty
  - Minimal fixtures, edge cases (empty file, no agents, missing fields)
  - Special chars in descriptions (dashes, parens, pipes)
  - Single-quoted YAML values
  - Malformed YAML (no crash)
  - CLI error handling (missing args, bad command)
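A stdlib-only state-machine parser of the kind this commit describes might look roughly like the sketch below. The registry layout (a top-level `agents:` key, two-space indented pack names, four-space indented fields), the function name `parse_agents`, and the sample values are assumptions for illustration; the real scripts/parse-registry.py is not shown in this conversation and may differ.

```python
import re

# Assumed layout: pack names at two-space indent, fields at four-space indent.
AGENT_RE = re.compile(r"^  ([A-Za-z0-9_-]+):\s*$")
FIELD_RE = re.compile(r"^    ([A-Za-z0-9_-]+):\s*(.*)$")


def parse_agents(text: str) -> dict:
    """Collect per-pack fields from the (assumed) `agents:` section."""
    agents = {}
    in_agents = False  # state: are we inside the top-level `agents:` block?
    current = None     # state: the pack whose fields we are reading
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if not line.startswith(" "):
            # Any top-level key switches the section we are in.
            in_agents = line == "agents:"
            current = None
            continue
        if not in_agents:
            continue
        match = AGENT_RE.match(line)
        if match:
            current = match.group(1)
            agents[current] = {}
            continue
        match = FIELD_RE.match(line)
        if match and current is not None:
            value = match.group(2).strip()
            # Strip one pair of matching single or double quotes.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]
            agents[current][match.group(1)] = value
    return agents
```

Lines that match neither pattern are ignored rather than raising, which is one way to get the "malformed YAML (no crash)" behaviour listed in the test cases. All values are returned as strings, so a caller would compare `experimental` against `"true"`.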
- Converted registry.yaml → registry.json
- json_field() and url_encode() now use jq instead of python3
- Registry parser replaced: Python regex script → 3-line jq queries
- Removed scripts/parse-registry.py (no longer needed)
- Terraform unzip fallback: busybox/jar instead of python3
- Preflight: require jq instead of python3
- Tests rewritten for jq-based parsing (31 pass)

python3 is no longer needed on the user's machine. Dependencies: bash, curl, aws-cli, jq.
…n, CI workflow, remove python3 dep
- Pack table updated with all 4 agent packs
- Hermes marked experimental in registry.json and registry.yaml
- Available Packs section updated with Pi and IronClaw descriptions
Simple mode and advanced mode now default to CFN CLI instead of Terraform. Advanced mode menu lists CFN CLI first and pre-selects it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The .env file (DATABASE_URL, LLM_BACKEND, etc.) was only loaded by the systemd service via EnvironmentFile. Running `ironclaw` interactively had no DATABASE_URL, causing "No database connection" in the setup wizard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The onboard wizard fails with "No database connection" even when PostgreSQL is running. The actual agent works fine with --no-onboard. Alias `ic` now runs `ironclaw run --no-onboard`. Original wizard available as `ironclaw-wizard`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use LLM_BACKEND=bedrock instead of openai_compatible through bedrockify. IronClaw's native Bedrock provider uses the AWS SDK Converse API directly via EC2 instance profile credentials, eliminating the bedrockify dependency.
IronClaw expects BEDROCK_MODEL and BEDROCK_REGION, not LLM_MODEL and AWS_DEFAULT_REGION. Also strip cross-region prefix (e.g. "us.") from model ID and set BEDROCK_CROSS_REGION separately.
The pre-built ironclaw release binary is not compiled with --features bedrock, so native LLM_BACKEND=bedrock fails at runtime. Revert to openai_compatible backend routing through bedrockify until upstream ships a bedrock-enabled binary.
Bedrockify exposes models as 'anthropic.claude-sonnet-4-6' without the cross-region 'us.' prefix or '-v1' suffix. The previous default 'us.anthropic.claude-sonnet-4-6-v1' caused 400 Bad Request errors.
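To make the ID mismatch concrete, here is a small sketch of the normalization implied by the two commits above: drop a cross-region prefix and a trailing `-v<N>` suffix. The helper name `to_bedrockify_model_id` and the prefix list are assumptions based only on the examples quoted in this conversation; this is not the change the commit itself made.

```python
import re

# Assumed prefix set: only "us." and "global." appear in this pull request.
CROSS_REGION_PREFIXES = ("us.", "global.")


def to_bedrockify_model_id(model_id: str) -> str:
    """Strip a cross-region prefix and a trailing '-v<N>' version suffix."""
    for prefix in CROSS_REGION_PREFIXES:
        if model_id.startswith(prefix):
            model_id = model_id[len(prefix):]
            break
    # Only a suffix at the very end is removed; IDs ending in e.g. ':0' are left alone.
    return re.sub(r"-v\d+$", "", model_id)
```

For the failing default quoted above, `to_bedrockify_model_id("us.anthropic.claude-sonnet-4-6-v1")` returns `"anthropic.claude-sonnet-4-6"`, the form the commit says bedrockify exposes.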
Each pack now declares PACK_TUI_COMMAND in its shell-profile. The SSM session document, completion screen, and bootstrap echo all use this variable instead of hardcoded 'openclaw tui'.

Examples:
- openclaw pack → 'openclaw tui'
- claude-code pack → 'claude'
- hermes pack → 'hermes'
- kiro-cli pack → 'kiro-cli'
- SSM doc name now pack-specific (Loki-Session-{pack}) to avoid
  cross-pack collisions in same account/region
- deploy_console() loads pack profile before displaying TUI command
- Profile sourcing uses grep/subshell extraction instead of raw source
  to avoid executing arbitrary code on installer host
- kiro-cli profile guards interactive code with [[ $- == *i* ]]
- JSON construction uses jq instead of hand-escaping
- Generic fallback (bash --login) instead of hardcoded openclaw tui
- Updated docs: README.md and wiki to show openclaw tui
Adds optional bootstrap that guides agents through setting up the aws-devops-agent skill for querying AWS DevOps Agent via boto3.

Key learnings encoded:
- send_message only in boto3, not CLI
- list-chats CLI has datetime parsing bug
- EventStream response with final_response blocks
- Multi-turn via executionId reuse
- Monthly quota tracking
…32K cap (#10)
Bedrock auto-discovery defaults to 32K contextWindow for discovered models. The bootstrap was only setting agents.defaults.model.primary without registering explicit model entries, so Opus 4.6 (200K context) was silently capped at 32K, causing 'context limit exceeded' errors.

Now the config patch includes explicit model entries with contextWindow: 200000 for both Opus and Sonnet, alongside the existing default model + heartbeat config.

Closes #9

Co-authored-by: Roy Osherove <575051+royosherove@users.noreply.github.com>
…ntext
- config-gen.py: add explicit model entries with contextWindow: 200000 instead of empty models array that relied on Bedrock discovery (32K default)
- Switch all defaults from us. to global. prefix for cross-region routing
- Opus 4.6 as primary, Sonnet 4.6 as fallback and heartbeat model
- Updated: config-gen.py, install.sh, manifest.yaml, template.yaml, bootstrap.sh

New deployments get correct 200K context out of the box, with no manual bootstrap step needed.
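As a rough illustration of what "explicit model entries" means here, the sketch below builds a config fragment in which every model carries its own `contextWindow`. Only `contextWindow: 200000`, the `agents.defaults.model.primary` path, the `amazon-bedrock/` reference form, and the model IDs come from this conversation; the function name, the `id` key, and the `fallback` key are hypothetical, and the real schema written by config-gen.py may differ.

```python
import json

CONTEXT_WINDOW = 200_000  # value stated in the commit message


def build_model_config(primary: str, fallback: str) -> dict:
    """Return a config fragment with one explicit entry per distinct model."""
    # dict.fromkeys de-duplicates while keeping order, in case both IDs match.
    model_ids = list(dict.fromkeys([primary, fallback]))
    return {
        # Explicit entries, so no model falls back to the 32K discovery default.
        "models": [
            {"id": model_id, "contextWindow": CONTEXT_WINDOW}  # "id" key is assumed
            for model_id in model_ids
        ],
        "agents": {
            "defaults": {
                "model": {
                    "primary": f"amazon-bedrock/{primary}",
                    "fallback": f"amazon-bedrock/{fallback}",  # hypothetical key
                }
            }
        },
    }


if __name__ == "__main__":
    config = build_model_config(
        "global.anthropic.claude-opus-4-6-v1",
        "global.anthropic.claude-sonnet-4-6",
    )
    print(json.dumps(config, indent=2))
```

The design point is that the models list is never left empty, so the context window no longer depends on what auto-discovery reports.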
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dccd258af2
    ]

    # Default to global. prefix for cross-region routing; respect user-provided model ID
    default_primary = model if ("." in model and "anthropic" in model) else "global.anthropic.claude-opus-4-6-v1"
Preserve user-specified Bedrock model IDs
The new default_primary guard silently discards any --model value that is not Anthropic-shaped, which is a regression from prior behavior where the provided model ID was always used. For example, passing meta.llama3-70b-instruct-v1:0 now produces amazon-bedrock/global.anthropic.claude-opus-4-6-v1 as primary, so deployments can run a different model than requested (with different output/cost characteristics) without any warning.
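One way to address this finding is sketched below, assuming the intent is "use whatever the user passed, and fall back to the default only when nothing was passed". The function name `resolve_primary` is hypothetical and this is not code from the pull request.

```python
from typing import Optional

DEFAULT_PRIMARY = "global.anthropic.claude-opus-4-6-v1"


def resolve_primary(model: Optional[str]) -> str:
    """Keep any user-supplied model ID; default only when none was given."""
    if model and model.strip():
        return model.strip()
    return DEFAULT_PRIMARY
```

With this shape, `resolve_primary("meta.llama3-70b-instruct-v1:0")` returns the Llama ID unchanged, while `resolve_primary(None)` and `resolve_primary("")` return the Opus default. If the Anthropic-only restriction is intentional, an alternative would be to keep the guard but emit a warning when a provided ID is replaced.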
What
New deployments now get the correct model config out of the box, with no manual bootstrap step needed.

Changes
config-gen.py (core fix)
- Replaced the empty `"models": []` with explicit Opus 4.6 + Sonnet 4.6 entries including `contextWindow: 200000`
- Switched `us.` fallback/heartbeat refs to the `global.` prefix
- `--model` is respected if it contains a valid Anthropic model ID

install.sh / manifest.yaml / template.yaml / bootstrap.sh
- Default model changed from `us.anthropic.claude-opus-4-6-v1` to `global.anthropic.claude-opus-4-6-v1`

Why
Bedrock auto-discovery uses `defaultContextWindow: 32000` for all discovered models. The old config shipped an empty models array and relied entirely on discovery, so Opus 4.6 (200K context) was silently capped at 32K.

The `global.` inference profile routes across all AWS regions automatically: better availability, same price.

Model defaults
| Role | Model |
| --- | --- |
| Primary | `global.anthropic.claude-opus-4-6-v1` |
| Fallback | `global.anthropic.claude-sonnet-4-6` |
| Heartbeat | `global.anthropic.claude-sonnet-4-6` |

Tested
- `global.` model IDs via `aws bedrock-runtime invoke-model`
- `contextWindow: 200000`

Related: #9